Super-Dirichlet Mixture Models Using Differential Line Spectral Frequencies for Text-Independent Speaker Identification
نویسندگان
چکیده
A new text-independent speaker identification (SI) system is proposed. This system utilizes the line spectral frequencies (LSFs) as alternative feature set for capturing the speaker characteristics. The boundary and ordering properties of the LSFs are considered and the LSF are transformed to the differential LSF (DLSF) space. Since the dynamic information is useful for speaker recognition, we represent the dynamic information of the DLSFs by considering two neighbors of the current frame, one from the past frames and the other from the following frames. The current frame with the neighbor frames together are cascaded into a supervector. The statistical distribution of this supervector is modelled by the so-called super-Dirichlet mixture model, which is an extension from the Dirichlet mixture model. Compared to the conventional SI system, which is using the mel-frequency cepstral coefficients and based on the Gaussian mixture model, the proposed SI system shows a promising improvement.
منابع مشابه
On von-mises fisher mixture model in text-independent speaker identification
This paper addresses text-independent speaker identification (SI) based on line spectral frequencies (LSFs). The LSFs are transformed to differential LSFs (MLSF) in order to exploit their boundary and ordering properties. We show that the square root of MLSF has interesting directional characteristics implying that their distribution can be modeled by a mixture of von-Mises Fisher (vMF) distrib...
متن کاملMinimum classification error training for speaker identification using Gaussian mixture models based on multi-space probability distribution
In our previous work, we have proposed a speaker modeling technique using spectral and pitch features for text-independent speaker identification based on Multi-Space Probability Distribution Gaussian Mixture Models (MSD-GMMs). We have presented a maximum likelihood (ML) estimation procedure for the MSD-GMM parameters and demonstrated its high recognition performance. In this paper, we describe...
متن کاملRobust text-independent speaker identification using Gaussian mixture speaker models
This paper introduces and motivates the use of Gaussian mixture models (CMM) for robust text-independent speaker identification. The individual Gaussian components of a GMM are shown to represent some general speaker-dependent spectral shapes that are efTective for modeling speaker identity. The focus of this work is on applications which require high identification rates using short utterance ...
متن کاملText-independent speaker identification using Gaussian mixture bigram models
In this paper, a novel speaker modeling technique based on Gaussian mixture bigram model (GMBM) is introduced and evaluated for text-independent speaker identification (speaker-ID). GMBM is a stochastic framework that explores the context or time dependency of continuous observations from an information source. In view of the fact that speech features are correlated between successive frames, w...
متن کاملSpeaker identification using Gaussian mixture models based on multi-space probability distribution
This paper presents a new approach to modeling speech spectra and pitch for text-independent speaker identification using Gaussian mixture models based on multi-space probability distribution (MSD-GMM). The MSD-GMM allows us to model continuous pitch values for voiced frames and discrete symbols representing unvoiced frames in a unified framework. Spectral and pitch features are jointly modeled...
متن کامل